Coreference Resolution in Research Papers from Multiple Domains

Authors

Abstract

Coreference resolution is essential for automatic text understanding to facilitate high-level information retrieval tasks such as summarisation or question answering. Previous work indicates that the performance of state-of-the-art approaches (e.g. based on BERT) noticeably declines when applied to scientific papers. In this paper, we investigate the task of coreference resolution in research papers and subsequent knowledge graph population. We present the following contributions: (1) we annotate a corpus for coreference resolution that comprises 10 different disciplines from Science, Technology, and Medicine (STM); (2) we propose transfer learning for coreference resolution in research papers; (3) we analyse the impact of coreference resolution on knowledge graph (KG) population; (4) we release a KG automatically populated from 55,485 papers in the STM domains. Comprehensive experiments show the usefulness of the proposed approach. Our approach considerably outperforms the baselines on our corpus with an F1 score of 61.4 (+11.0), while the evaluation against a gold standard KG shows that coreference resolution improves the quality of the populated KG significantly, with an F1 score of 63.5 (+21.8).

Similar resources

Corpus for Coreference Resolution on Scientific Papers

The ever-growing number of published scientific papers prompts the need for automatic knowledge extraction to help scientists keep up with the state-of-the-art in their respective fields. To construct a good knowledge extraction system, annotated corpora in the scientific domain are required to train machine learning models. As described in this paper, we have constructed an annotated corpus fo...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Reconcile: A Coreference Resolution Research Platform

We have created a software infrastructure called Reconcile that is a platform for the development of learning-based noun phrase (NP) coreference resolution systems. Reconcile’s architecture was designed to facilitate the rapid creation of coreference resolution systems, easy implementation of new feature sets and approaches to coreference resolution, and empirical evaluation of coreference res...

Coreference Resolution

Overview: Coreference resolution refers to the task of clustering different mentions referring to the same entity. This is particularly useful in other NLP tasks, such as retrieving information about specific named entities and machine translation. In this report, we discuss our approach, implementation and observations for a few baseline systems, a rule-based system, and a classifi...

Towards Multiple Antecedent Coreference Resolution in Specialized Discourse

Despite the popularity of coreference resolution as a research topic, the overwhelming majority of the work in this area has so far focused on single-antecedent coreference only. Multiple antecedent coreference (MAC) has been largely neglected. This can be explained by the scarcity of the phenomenon of MAC in generic discourse. However, in specialized discourse such as patents, MAC is very dominan...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-72113-8_6